Comment on "Can one predict DNA Transcription Start Sites by Studying Bubbles?"
Recently, van Erp et al. published an article [1] presenting
conclusions which contradict earlier works by us [2, 3]. We
believe this criticism to be misguided; indeed, the authors'
conclusions largely substantiate the original discovery.
This needs to be clarified so that the broader community is
not misled.

In our earlier work we provided experimental and the- 
oretical evidence that functionally important sites for 
transcription can coincide with thermally induced open- 
ings of double-stranded DNA. The comment by van Erp 
et al. regarding the veracity of our experiments is with- 
out foundation. The theoretical basis for these stud- 
ies was provided by the simple Peyrard-Bishop-Dauxois 
(PBD) model. Using parameters already established in 
the literature, we performed Langevin simulations on 
the PBD model for a few specific viral DNA sequences. 
The object of our simulations was to establish whether 
there were certain regions in these highly heterogeneous 
real sequences that were more prone to sustain large 
thermally induced openings ('bubbles') of the double 
stranded molecule. Our simulations indicated that there 
indeed were such regions, and we experimentally verified 
their existence using the S1 nuclease technique. Based
on these combined theoretical and experimental findings 
we made the observation that in these viral sequences 
the regions sustaining the large bubbles coincided with 
known binding sites active in transcription events. These 
included, but were not limited to, the transcription initiation
site itself. Based on these observations, we concluded
that it might be possible more generally to identify DNA 
functionally active sites, including transcription initia- 
tion sites, by studying the thermal fluctuations (partic- 
ularly large amplitude coherent openings) of the double 
strand. It is very important to note that in fact these 
observations and speculations could have been made entirely
on the basis of experimental evidence. We were, however,
fortunate to also possess a model (PBD) that contains enough
of the essential entropic ingredients to guide us accurately
to the location of openings.
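
To make the simulation procedure described above concrete, here is a
minimal sketch of a Langevin run on the PBD model. It is not the code
used in the original work. The parameter values are those commonly
quoted for the PBD model in the literature and are assumptions here, as
are the free chain ends, the simple Euler-type integrator, the friction
constant, the 1.5 Angstrom opening threshold, and all function names.
No equilibration period is discarded, so short runs are only
illustrative.

```python
import numpy as np

# ASSUMED parameter set: values commonly quoted for the PBD model in the
# literature, not taken from this comment. Units: eV, Angstrom, ps.
MORSE_DEPTH = {"A": 0.05, "T": 0.05, "G": 0.075, "C": 0.075}   # D_n
MORSE_WIDTH = {"A": 4.2, "T": 4.2, "G": 6.9, "C": 6.9}         # a_n
K, RHO, BETA = 0.025, 2.0, 0.35                                # stacking term
MASS = 300 * 1.0364e-4      # 300 amu expressed in eV ps^2 / Angstrom^2
KB = 8.617e-5               # Boltzmann constant, eV / K


def sequence_params(seq):
    """Per-site Morse parameters for a sequence given as a string of A/T/G/C."""
    d = np.array([MORSE_DEPTH[b] for b in seq])
    a = np.array([MORSE_WIDTH[b] for b in seq])
    return d, a


def pbd_energy(y, d, a):
    """PBD potential energy: on-site Morse term plus anharmonic stacking."""
    morse = d * (np.exp(-a * y) - 1.0) ** 2
    dy = np.diff(y)
    coupling = 1.0 + RHO * np.exp(-BETA * (y[1:] + y[:-1]))
    return morse.sum() + 0.5 * K * np.sum(coupling * dy ** 2)


def pbd_force(y, d, a):
    """Analytic force -dV/dy_n for a chain with free ends."""
    e = np.exp(-a * y)
    f = 2.0 * a * d * e * (e - 1.0)              # Morse contribution
    dy = np.diff(y)
    g = RHO * np.exp(-BETA * (y[1:] + y[:-1]))
    lin = K * (1.0 + g) * dy                     # harmonic-like part
    common = 0.5 * K * BETA * g * dy ** 2        # from the exponential factor
    f[1:] += common - lin
    f[:-1] += common + lin
    return f


def open_fraction(seq, temp=300.0, steps=20000, dt=0.002, gamma=0.05,
                  threshold=1.5, seed=0):
    """Fraction of time steps at which each base pair is stretched past threshold."""
    rng = np.random.default_rng(seed)
    d, a = sequence_params(seq)
    y = np.zeros(len(seq))
    v = np.zeros(len(seq))
    kick = np.sqrt(2.0 * gamma * KB * temp * dt / MASS)   # thermal noise amplitude
    opened = np.zeros(len(seq))
    for _ in range(steps):
        v += dt * (pbd_force(y, d, a) / MASS - gamma * v)
        v += kick * rng.standard_normal(len(seq))
        y += dt * v
        opened += y > threshold
    return opened / steps
```

A call such as `open_fraction(sequence)` returns one number per base
pair; in a study of the kind described here, many long, independent
runs would be averaged before comparing regions of a sequence.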

The claim of van Erp et al. is that our simulation 
technique was inadequate and therefore the entire work 
is flawed. As in our own independent elaborations [4]
of our original discovery, van Erp et al. assume that
thermodynamic equilibrium averages are sufficient and
explicitly evaluate the integrals involved in the
partition function for the above model. It is therefore
correct that their results and ours [4] give more accurate
thermodynamic averages than our initial results did.
However, the two crucial aspects
of our initial findings are confirmed by these studies of
thermodynamic equilibrium properties:

1. In all three of the viral sequences examined by van
Erp et al. they find the regions which sustain large
bubbles to be precisely those we originally identified
[2, 3], and these regions indeed include the sites
active during transcription.

2. van Erp et al. find that the two-base-pair mutation
of the AAVP5 promoter causes a significant suppression
of large thermal fluctuations at the former
transcription sites, exactly as we reported earlier
[2, 3].
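
For reference, the equilibrium quantities at issue can be written out
explicitly. The block below is a sketch using the form of the PBD
potential as it is usually stated in the literature; the symbols
($D_n$, $a_n$, $k$, $\rho$, $\beta$) and the indicator function
$\chi_B$, which equals 1 when a configuration contains the bubble $B$
and 0 otherwise, are notation assumed here rather than taken from this
comment, and the precise bubble criterion differs between studies.

```latex
\begin{align}
V(\{y_n\}) &= \sum_{n=1}^{N} D_n\left(e^{-a_n y_n}-1\right)^2
  + \sum_{n=2}^{N} \frac{k}{2}\left[1+\rho\,e^{-\beta\,(y_n+y_{n-1})}\right]
    \left(y_n-y_{n-1}\right)^2 , \\
Z &= \int \prod_{n=1}^{N} dy_n \; e^{-V(\{y_n\})/k_B T} , \\
P(B) &= \frac{1}{Z}\int \prod_{n=1}^{N} dy_n \;
        \chi_B(\{y_n\})\, e^{-V(\{y_n\})/k_B T} .
\end{align}
```

A Langevin simulation estimates $P(B)$ as a time average along a
trajectory, whereas direct integration evaluates the ratio of integrals
itself; the two agree only when the trajectory samples equilibrium
adequately.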

We emphasize again that these observations were 
strongly supported by experiments (notably missing in 
the work of van Erp et al.). It is indeed true that our
subsequent studies [4] have found that the relevant quantity
for function is likely to be the probability of bubbles
of specific sizes, presumably associated with the physical
dimensions of, e.g., the transcription machinery, but this
does not dilute our primary discovery.
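
As an illustration of what a size-resolved bubble probability could
look like in practice, the following sketch counts bubbles of a given
length in a set of configurations. The definition used (a maximal run
of consecutive base pairs stretched beyond a threshold), the 1.5
Angstrom default, and the function names are assumptions made for this
example, not the definitions used in [4].

```python
import numpy as np


def bubble_lengths(y, threshold=1.5):
    """Return (start_index, length) for each maximal run of open base pairs.

    A base pair counts as open when its stretching exceeds `threshold`
    (an assumed criterion).
    """
    open_mask = np.asarray(y) > threshold
    # Pad with closed sites so every run has a rising and a falling edge.
    padded = np.concatenate(([False], open_mask, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), (ends - starts).tolist()))


def bubble_probability(snapshots, size, threshold=1.5):
    """Per-site probability that a bubble of exactly `size` pairs starts there.

    `snapshots` is a 2D array: one row per configuration, one column per
    base pair.
    """
    snapshots = np.asarray(snapshots)
    counts = np.zeros(snapshots.shape[1])
    for y in snapshots:
        for start, length in bubble_lengths(y, threshold):
            if length == size:
                counts[start] += 1
    return counts / len(snapshots)
```

Scanning `size` over a range of values would then show whether a
particular bubble length, rather than opening in general, singles out
the functionally active sites.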

Van Erp et al. make one valid point in regard to our
earlier work: we published results on a "control" sequence,
which consisted of a non-promoter segment containing a
similar number of base pairs to the promoter sequences.
The results we showed for this case were unfortunate as 
they indicated that no bubbles occurred in this sequence. 
More accurate Langevin results do show the occurrence 
of bubbles in this sequence, as noted by van Erp et al. 
This may indicate that a human coding gene was a poor 
choice for the control sequence, and that our scenario is 
best suited to viral sequences or, again, that the specific
bubble sizes are key. In any case this does not affect the 
validity of our original results for the active promoters. 

In conclusion, the results of van Erp et al. in essence
confirm our initial discovery: the PBD model is in- 
deed an accurate guide to the location of experimentally- 
identified active openings. Clearly this simple model 
must be further augmented to describe either fully re- 
alistic dynamics, or how biological machinery (such as 
RNA polymerase) engages these active regions. We are 
working to implement these augmentations. 